Regret of Age-of-Information Bandits

Abstract

We consider a system with a single source that measures/tracks a time-varying quantity and periodically attempts to report these measurements to a monitoring station. Each update from the source has to be scheduled on one of $K$ available communication channels. The probability of success of each attempted communication is a function of the channel used. This function is unknown to the scheduler. The metric of interest is the Age-of-Information (AoI), formally defined as the time elapsed since the destination received the most recent update from the source. We model our scheduling problem as a variant of the multi-arm bandit problem with communication channels as arms. We characterize a lower bound on the AoI regret achievable by any policy and characterize the performance of UCB, Thompson Sampling, and their variants. Our analytical results show that UCB and Thompson Sampling are order-optimal for AoI bandits. In addition, we propose novel policies which, unlike UCB and Thompson Sampling, use the current AoI to make scheduling decisions. Via simulations, we show that the proposed AoI-aware policies outperform existing AoI-agnostic policies.
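The setting in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's implementation: it assumes Bernoulli channels, the standard UCB1 index, and the convention that the AoI resets to 1 after a successful update and grows by 1 otherwise. The function names `simulate_aoi_ucb` and `simulate_fixed_channel` and all parameter names are hypothetical. The gap between the two cumulative ages is one empirical stand-in for the AoI regret against a scheduler that knows the best channel.

```python
import math
import random


def simulate_aoi_ucb(success_probs, horizon, seed=0):
    """Schedule updates with UCB1 over Bernoulli channels; return (cumulative AoI, pulls)."""
    rng = random.Random(seed)
    k = len(success_probs)
    pulls = [0] * k       # number of times each channel was used
    successes = [0] * k   # number of successful updates per channel
    age = 1               # assumed convention: AoI starts at 1
    cumulative_age = 0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # use every channel once before computing indices
        else:
            # UCB1 index: empirical success rate plus exploration bonus
            arm = max(
                range(k),
                key=lambda i: successes[i] / pulls[i]
                + math.sqrt(2 * math.log(t) / pulls[i]),
            )
        delivered = rng.random() < success_probs[arm]
        pulls[arm] += 1
        successes[arm] += int(delivered)
        # AoI resets on a successful update and grows by one slot otherwise
        age = 1 if delivered else age + 1
        cumulative_age += age
    return cumulative_age, pulls


def simulate_fixed_channel(success_prob, horizon, seed=0):
    """Cumulative AoI when a single channel is always used (e.g. the best one)."""
    rng = random.Random(seed)
    age, cumulative_age = 1, 0
    for _ in range(horizon):
        age = 1 if rng.random() < success_prob else age + 1
        cumulative_age += age
    return cumulative_age


if __name__ == "__main__":
    probs = [0.3, 0.5, 0.8]   # illustrative values, unknown to the scheduler
    ucb_age, pulls = simulate_aoi_ucb(probs, horizon=5000)
    oracle_age = simulate_fixed_channel(max(probs), horizon=5000)
    print("pulls per channel:", pulls)
    print("empirical AoI regret:", ucb_age - oracle_age)
```

This sketch is AoI-agnostic in the abstract's sense: the channel choice depends only on past successes and failures, never on the current value of `age`.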

Similar resources

Regret of Queueing Bandits

We consider a variant of the multiarmed bandit problem where jobs queue for service, and service rates of different servers may be unknown. We study algorithms that minimize queue-regret: the (expected) difference between the queue-lengths obtained by the algorithm, and those obtained by a “genie”-aided matching algorithm that knows exact service rates. A naive view of this problem would sugges...

Dueling Bandits with Weak Regret

We consider online content recommendation with implicit feedback through pairwise comparisons, formalized as the so-called dueling bandit problem. We study the dueling bandit problem in the Condorcet winner setting, and consider two notions of regret: the more well-studied strong regret, which is 0 only when both arms pulled are the Condorcet winner; and the less well-studied weak regret, which...

Bayesian Bandits, Secretaries, and Vanishing Computational Regret

We consider the finite-horizon multi-armed bandit problem under the standard stochastic assumption of independent priors over the reward distributions of the arms. We define a new notion of computational regret against the Bayesian optimum solution instead of worst-case against the true underlying distributions. We show that when the priors of the arms satisfy a log-concavity condition, there i...

Regret Bounds for Deterministic Gaussian Process Bandits

This paper analyzes the problem of Gaussian process (GP) bandits with deterministic observations. The analysis uses a branch and bound algorithm that is related to the UCB algorithm of (Srinivas et al., 2010). For GPs with Gaussian observation noise, with variance strictly greater than zero, (Srinivas et al., 2010) proved that the regret vanishes at the approximate rate of $O(1/\sqrt{t})$, where t...

Journal

Journal title: IEEE Transactions on Communications

Year: 2022

ISSN: 1558-0857, 0090-6778

DOI: https://doi.org/10.1109/tcomm.2021.3118037